ABSTRACT


A  brief  discussion  of  basic  encoding  and  decoding  on  noisy  channels  is 
presented  to  provide  a  background  for  the  experimental  portion  of  this  research. 

A partitioned 3-state Gilbert model is used to model a burst channel, and a
method of calculating error sequence probabilities using this model is introduced.

Error sequence probability calculations are made using a (7,3) maximal
length code and a (15,7) BCH code.

Observations  are  made  about  the  general  type  of  decoding  rule  to  use  to 
give  the  lowest  probability  of  decoding  error  on  burst  channels  when  using  an 
interleaving  technique. 


I.  INTRODUCTION


A.  THE  COMMUNICATION  SYSTEM 
1 .  General 

A general communication system (Fig. 1) contains an information
source, an encoder, a channel, a decoder, and a destination. The
information source selects a desired message from a set of possible
messages. The source encoder, when one is used, compresses the
data by removing inherent redundancy in the source output so as to
make each possible output equally likely. The channel encoder
reintroduces redundancy into the data to improve the reliability of
transmission over the channel.

The communication channel is the medium that conveys
information from the source location to the destination location. The
channel decoder uses the redundancy introduced by the channel encoder
to correct errors introduced during transmission.

If a discrete memoryless source whose output is equally
probable binary digits is assumed, the source encoder and source
decoder are no longer necessary, and the resulting communication system
is shown in Figure 2. In actual practice this is the most common
arrangement for the communication system, even if the information source does
not produce equally probable binary digits.

If a communication channel is noisy, it is not possible, in
general, to reconstruct the output of the information source with
certainty at the channel decoder. Shannon [1], however, showed that
by proper encoding the probability of making a decoding error can be
made arbitrarily small if the rate of data transmitted across the
communication channel does not exceed a maximum value known as the channel
capacity C.

The  capacity  of  a  channel,  in  general,  is  influenced  by  a 
number  of  factors.  The  number  of  channel  inputs  and  outputs,  and  the 
set  of  all  possible  transition  probabilities  from  the  inputs  to  the  outputs, 
all  affect  the  channel  capacity. 

2 .  Noise 

The effect of channel noise is to introduce the possibility
that the output of the channel may differ from the input to the channel.
The particular way in which the noise affects the channel's input is
determined by the type of channel and the type of channel noise
encountered. Memoryless channels, which are often used as theoretical models,
assume that all digits transmitted over the channel are affected
independently by channel noise. Unfortunately, this memoryless property is
rarely found in real channels, an important exception being certain deep
space channels.

Errors on most real channels tend to occur in groups or
bursts. These real channels are thus channels with memory, because the
probability of the channel changing a transmitted digit depends on
whether the channel changed the previously transmitted digit. The
probability of a given error sequence occurring is thus calculated as
the product of a series of conditional probabilities.

B.  NOISY CHANNEL ENCODING AND DECODING

If  noise  were  not  present  in  the  channel,  no  encoding  of  the 
source  output  would  be  needed  to  get  the  transmitted  message  to  its 
destination.  The  presence  of  noise,  however,  requires  sufficient 
redundancy  in  the  encoded  message  so  that  the  original  message  can 
be  recovered  at  the  decoder. 

For binary encoding this required redundancy can be provided
by using block codes and partitioning the input sequences into blocks
of K bits. The encoder outputs blocks of a longer length (N bits), forming
an (N,K) block code. The encoder thus maps the set of $2^K$ possible K-bit
sequences (messages) into a set of N-bit sequences called codewords.
In the channel, noise may be present, and the input to the channel
decoder (Y) may differ from the output of the channel encoder (X). The
decoder maps each possible received sequence back
into the message most likely to have been transmitted. Since the
decoder must decide which message was transmitted for
a given received sequence, there is a certain probability of making a
decoding error. The probability of the decoder making an error depends largely
on the mathematical properties of the type of code used, the
type of decoding used, and the number and type of channel errors
encountered.


II.  CHANNEL MODELS


A.  THE BINARY SYMMETRIC CHANNEL

The simplest model, and the one most commonly used to represent
error sequences, is the binary symmetric channel (BSC). The BSC is shown in
Figure 3. The transition probabilities are assumed to be constant and
do not depend on previous uses of the channel. The maximum rate
at which information can be reliably transmitted across the channel is
called the channel's capacity C. Shannon [1] and others have shown
that the capacity of the binary symmetric channel is $C = 1 - H(p)$, where
p is the crossover or error probability of the BSC as shown in Figure 3.
H(p) is called the entropy function and is defined as

$$H(p) = -p \log_2 p - (1-p) \log_2 (1-p).$$
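The capacity formula for the binary symmetric channel can be evaluated numerically. The following is a minimal Python sketch; the function names are illustrative and are not taken from the original work.

```python
import math


def binary_entropy(p: float) -> float:
    """Entropy function H(p) = -p log2 p - (1-p) log2 (1-p), with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def bsc_capacity(p: float) -> float:
    """Capacity C = 1 - H(p) of a BSC with crossover probability p, in bits per digit."""
    return 1.0 - binary_entropy(p)


# The capacity falls from 1 at p = 0 to 0 at p = 0.5.
for crossover in (0.0, 0.01, 0.11, 0.5):
    print(crossover, round(bsc_capacity(crossover), 4))
```

The capacity is symmetric about p = 0.5, since a channel that inverts almost every digit is as informative as one that inverts almost none.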

In information theory, the binary symmetric channel is the most
often used model of a communication channel. This idealized model has
been shown to accurately represent some deep space communication links,
but it is a poor model for most real communication channels.
Errors on real channels caused by lightning, interference from another
transmitter, fading propagation paths, and many other natural and
man-made phenomena tend to occur in groups or bursts.

Bursty channels are called channels with memory because the
probability of making an error on a particular digit of an information sequence
is greatly increased if an error is made on the preceding digit. The memory
characteristic of most real channels is the reason the BSC fails as an
accurate model.

B.  THE  GILBERT  MODEL 

1 .  Introduction  and  Description 

A simple model for a channel with memory was proposed by
E. N. Gilbert [5]. The pictorial representation of the Gilbert model
is shown in Figure 4. This model of a burst-noise binary channel uses a
Markov chain with two states called G and B. In state G no transmission
errors are made, and in state B the probability of making an error is h.
The parameters h, P, and p are assumed constant; in the state equations
that follow, p is the probability of a transition from G to B and P is the
probability of a transition from B to G. The probability of
making an error on any digit of a binary sequence depends on the
state of the channel when that digit is transmitted. The fraction of time
spent in the bursty state B is p/(P+p), and the fraction of time spent in the
good state G is P/(P+p).

2.  The 3-State Partitioned Gilbert Model
a.  Definition

In the two-state Gilbert model, an error may or may not occur when
the channel is in the bursty state B. If the bursty state B is partitioned into
an error-free state $B_0$ and an error state $B_1$, the resulting channel model
is as shown pictorially in Figure 6. The model now consists of three
states, G, $B_0$, and $B_1$, and errors occur when and only when the channel
is in state $B_1$.

With this partitioned model, the probability of a digit
of a transmitted binary sequence being in error depends on the state
of the channel and the transition probability from that state to the $B_1$
state. The state of the channel is determined by the previously
transmitted digit. If the previous digit was in error, the channel is in state
$B_1$ and the probability of a digit error is $\Pr(E) = h(1-P)$. If the previous digit
was not received in error, the channel could be in either state $B_0$ or state
G, and

$$\Pr(E) = \Pr(B_0 \mid \text{not } B_1)\, h(1-P) + \Pr(G \mid \text{not } B_1)\, h p,$$

where $\Pr(B_0 \mid \text{not } B_1)$ is the probability of the channel being in state $B_0$ and
$\Pr(G \mid \text{not } B_1)$ is the probability of the channel being in state G, given that the
previous digit was not in error. In a similar manner it is possible
to calculate the probability of occurrence of a complete binary error
sequence using this model. Calculating the probability of a digit error amounts to
calculating the probability of the state $B_1$. To calculate the probability of an
error sequence, the a priori state probabilities must be calculated for the
given model parameters P, p, and h.

b.  Calculation  of  a  priori  State  Probabilities 

Since the probability of occupancy of a state at any
digit is determined only by the previous digit and the transition
probabilities, the state probabilities at the (k+1)st digit can be expressed by the
following equations:

$$\pi_G(k+1) = (1-p)\,\pi_G(k) + P\,\bigl(\pi_{B_0}(k) + \pi_{B_1}(k)\bigr)$$

$$\pi_{B_0}(k+1) = p(1-h)\,\pi_G(k) + (1-P)(1-h)\,\bigl(\pi_{B_0}(k) + \pi_{B_1}(k)\bigr)$$

$$\pi_{B_1}(k+1) = hp\,\pi_G(k) + (1-P)h\,\bigl(\pi_{B_0}(k) + \pi_{B_1}(k)\bigr)$$

As $k \to \infty$, the probabilities $\pi_G(k)$, $\pi_{B_0}(k)$, and
$\pi_{B_1}(k)$ approach equilibrium values $\pi_G$, $\pi_{B_0}$, and $\pi_{B_1}$. Thus,
the equations reduce to:

$$\pi_G = (1-p)\,\pi_G + P\,(\pi_{B_0} + \pi_{B_1})$$

$$\pi_{B_0} = p(1-h)\,\pi_G + (1-P)(1-h)\,(\pi_{B_0} + \pi_{B_1})$$

$$\pi_{B_1} = hp\,\pi_G + (1-P)h\,(\pi_{B_0} + \pi_{B_1})$$

Since $(\pi_{B_0} + \pi_{B_1}) = (1 - \pi_G)$, its substitution
yields

$$\pi_G = \frac{P}{P+p}, \qquad \pi_{B_0} = \frac{(1-h)\,p}{P+p}, \qquad \pi_{B_1} = \frac{hp}{P+p}.$$


An alternate method of solution is to observe in the
original Gilbert model that the fraction of time spent in state G is P/(P+p)
and the fraction of time spent in the burst state B is p/(P+p). The fraction
of time spent in state $B_1$ is the burst state error probability h times the
fraction of time in the burst state, or hp/(P+p), and the fraction of time
spent in state $B_0$ is thus (1-h) p/(P+p).
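The equilibrium values can be checked numerically by iterating the state equations from an arbitrary starting distribution and comparing the result with the closed-form fractions of time. The sketch below assumes that p is the G-to-B transition probability and P the B-to-G transition probability, as the state equations imply. The parameter values and function names are hypothetical and chosen only for illustration.

```python
def next_state_probabilities(pi, P, p, h):
    """One step of the three-state recursion; pi = (pi_G, pi_B0, pi_B1)."""
    g, b0, b1 = pi
    burst = b0 + b1  # probability of being in the burst state B
    return (
        (1 - p) * g + P * burst,
        p * (1 - h) * g + (1 - P) * (1 - h) * burst,
        p * h * g + (1 - P) * h * burst,
    )


def equilibrium(P, p, h):
    """Closed-form equilibrium probabilities (pi_G, pi_B0, pi_B1)."""
    return (P / (P + p), (1 - h) * p / (P + p), h * p / (P + p))


# Hypothetical parameter values, not taken from the original work.
P, p, h = 0.2, 0.05, 0.6

pi = (1.0, 0.0, 0.0)  # start with the channel known to be in state G
for _ in range(500):
    pi = next_state_probabilities(pi, P, p, h)

print(pi)
print(equilibrium(P, p, h))
```

The iteration converges geometrically at rate |1 - P - p|, so a few hundred steps are ample for these values.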

c.  Calculation  of  Error  Sequence  Probability 

Given  any  binary  sequence,  the  probability  that  that 
sequence  is  the  error  sequence  of  a  Gilbert  channel  can  be  calculated  as 
follows: 


(1)  Initialize by choosing values $\pi_G(0)$,
$\pi_{B_0}(0)$, $\pi_{B_1}(0)$. (If the initial state of
the channel is unknown or not specified, a
sensible choice is to initialize to the
equilibrium state distribution: $\pi_G(0) = \pi_G$, etc.)

(2)  If the k th digit is in error, assign

$$\pi_{B_0}(k) = \pi_G(k) = 0$$

$$\pi_{B_1}(k) = hp\,\pi_G(k-1) + (1-P)h\,\bigl(\pi_{B_0}(k-1) + \pi_{B_1}(k-1)\bigr)$$

Assign the probability of the sequence
after the k th digit $\mathrm{PSEQ}(k) = \pi_{B_1}(k)$.

(3)  If the k th digit is not in error, assign

$$\pi_{B_1}(k) = 0$$

$$\pi_{B_0}(k) = p(1-h)\,\pi_G(k-1) + (1-P)(1-h)\,\bigl(\pi_{B_0}(k-1) + \pi_{B_1}(k-1)\bigr)$$

$$\pi_G(k) = (1-p)\,\pi_G(k-1) + P\,\bigl(\pi_{B_0}(k-1) + \pi_{B_1}(k-1)\bigr)$$

Assign the sequence probability after
the k th digit $\mathrm{PSEQ}(k) = \pi_{B_0}(k) + \pi_G(k)$.
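The three steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the program used in the original work: the function name, the default initialization to the equilibrium distribution, and the parameter values are all hypothetical. An error sequence is given as a list of digits, with 1 marking a digit in error.

```python
from itertools import product


def sequence_probability(errors, P, p, h, init=None):
    """Return PSEQ after the last digit of `errors` (1 = digit in error)."""
    if init is None:
        # Assumed default: start from the equilibrium state probabilities.
        init = (P / (P + p), (1 - h) * p / (P + p), h * p / (P + p))
    g, b0, b1 = init
    pseq = 1.0  # probability of the empty sequence
    for e in errors:
        burst = b0 + b1
        if e:
            # Step (2): the digit is in error, so the channel must be in B1.
            g, b0, b1 = 0.0, 0.0, h * p * g + (1 - P) * h * burst
            pseq = b1
        else:
            # Step (3): the digit is correct, so the channel is in G or B0.
            g, b0, b1 = (
                (1 - p) * g + P * burst,
                p * (1 - h) * g + (1 - P) * (1 - h) * burst,
                0.0,
            )
            pseq = b0 + g
    return pseq


# Hypothetical parameter values, chosen only for illustration.
P, p, h = 0.2, 0.05, 0.6
single_error = sequence_probability([1], P, p, h)
double_error = sequence_probability([1, 1], P, p, h)
total = sum(sequence_probability(s, P, p, h) for s in product((0, 1), repeat=6))
print(single_error, double_error, total)
```

With equilibrium initialization, the probability of a single error equals hp/(P+p), and the probabilities of all sequences of a given length sum to one. Two adjacent errors are more likely than the square of the single-error probability, which reflects the memory of the channel.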

d.  Modeling  of  Real  Channels 

In  recent  years  many  channel  models  have  been  proposed 
to  characterize  the  performance  of  real  communications  channels.  Gilbert 
originally  proposed  the  simple  two  state  model  for  a  channel  with 
memory  and  had  limited  success  in  choosing  the  parameters  of  his  model 
to  produce  statistics  similar  to  given  finite  length  error  sequences.  Using 
this  model  it  is  impossible  to  reconstruct  the  sequence  of  states  from  a 
given  error  sequence  because  of  the  many  possible  sequences  of  states 
that  produce  the  same  given  error  sequence. 

Fritchman [?] extended the model of Gilbert by
studying the general case of finite-state models with K error-free states
and N-K error states. Many more complex models have been developed in
attempts to accurately represent the performance of real channels. The
accuracy of a developed model for a given real channel
is usually assessed by performing a statistical analysis on a finite data
sequence from the channel and comparing the results to the statistics of
data produced using the constructed model. Increasing the model's
complexity increases the number of possible ways the model can generate
a particular sequence and thus reduces the chance of obtaining accurate
statistical data about the model.


III.  CODING  ON  A  NOISY  CHANNEL 


A.  BLOCK  CODES 

1 .  General 

Block codes are usually specified as (N,K) codes, where N
is the number of codeword digits (block length) and K is the number of
information digits in a codeword. The rate, R, of the code is the ratio of
the number of information digits in a codeword to the total number of
digits in the codeword (R = K/N). The Hamming distance between two
codewords is the number of positions in which the digits of the two codewords
differ. The Hamming weight of a codeword is its number of non-zero
components. The distance between two binary codewords is the Hamming
weight of their difference. The distance between a transmitted codeword
$x_m$ and the received sequence y, denoted $d(x_m, y)$, is the number of
transmission errors occurring in the channel.
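These definitions translate directly into code. The following Python sketch uses illustrative function names and example vectors that do not come from the original work; it shows that the distance between two binary words equals the weight of their modulo-2 difference.

```python
def hamming_weight(x):
    """Number of non-zero components of x."""
    return sum(1 for digit in x if digit != 0)


def hamming_distance(x, y):
    """Number of positions in which x and y differ."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    return sum(1 for a, b in zip(x, y) if a != b)


def difference_mod2(x, y):
    """Componentwise modulo-2 difference (equal to the sum) of two binary words."""
    return [(a + b) % 2 for a, b in zip(x, y)]


# Illustrative 7-digit words: y is x with errors in the second and fourth digits.
x = [1, 0, 1, 1, 0, 0, 1]
y = [1, 1, 1, 0, 0, 0, 1]
print(hamming_distance(x, y), hamming_weight(difference_mod2(x, y)))
```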

2 .  Error  Correction  Bounds  for  (N,K)  Block  Codes 
a.  Random  Error  Correction 

Let $d_{\min}$ denote the minimum distance of an (N,K)
block code (the least Hamming distance between codewords). At least
two codewords differ in only $d_{\min}$ of their N positions. It has been
shown by Peterson [2] and others that block codes with minimum
distance $d_{\min}$ can in general detect $d_{\min} - 1$ errors or correct
$\lfloor (d_{\min} - 1)/2 \rfloor$ errors. It is also possible to decode in such a way as to
simultaneously correct t or fewer errors and detect up to m additional errors
(t + m or fewer errors in all) if and only if $d_{\min} > 2t + m$. If a code is used for error correction
only and has a minimum distance $d_{\min} > 2t$, the code is
capable of correcting t or fewer random errors.

Two codes can have the same minimum distance, yet one
of the codes may have much better error-correcting ability because of its
capability to correct more error patterns of weight greater than that guaranteed
by its minimum distance. Minimum distance alone is therefore not a
complete measure of the goodness of a code.

Gilbert [8] proved that for any N > 0 and d > 0
such that $d/N \le 1/2$, there exists a code of length N and minimum distance
$d_{\min} \ge d$ with a rate $R \ge 1 - H(d/N)$ (where H( ) is the binary entropy
function). This bound, known as the Gilbert bound, is often used as a
measure of goodness for a code. Since $t = \lfloor (d-1)/2 \rfloor$ errors can be
corrected by a code with a minimum distance d, the Gilbert bound may be
expressed, taking d as approximately 2t, as

$$H\!\left(\frac{2t}{N}\right) \ge 1 - R$$

b.  Burst  Error  Correction 

An error sequence of length N is said to contain a burst
of length t if all non-zero digits are confined to a span of t consecutive
positions. Since a burst of length t is also one of the random error
patterns of weight t or less, it is clear that a code capable of correcting any
pattern of t or fewer errors is also capable of correcting all bursts of
length t or less. Gallager [3] has shown that a code of length N
and rate R can correct all bursts of length t or less only if

$$\frac{t}{N} \le \frac{1}{2}(1 - R)$$

This relation is known as the Gallager bound, and it can be shown that as
the block length N approaches infinity, codes exist which meet the
Gallager bound.

c.  Comparison  of  Burst  Correction  and  Random 
Error  Correction 

A comparison of the burst and random error correcting
capabilities of codes can be obtained by comparing the Gilbert and Gallager
bounds as shown in Figure 6. Also sketched is the asymptotic form of an
upper bound on random error correcting capability due to Plotkin [9].
The bounds show that as N approaches infinity, the length of correctable
bursts is twice the weight of correctable random error patterns.

B.  LINEAR  CODES 
1 .  General 

The alphabet of two symbols, 0 and 1, under modulo-2
addition and multiplication is called the Galois field of two elements (or
binary field) and is usually denoted GF(2). It can be shown that for any
integer $q = p^n$, where p is prime and $n \ge 1$, a Galois field of q elements
exists. This field is usually denoted GF(q). The set of all binary N-tuples
is a vector space over GF(2) of dimension N under the operation of
modulo-2 addition. A binary code is called linear if and only if it is a subspace
of the space of all N-tuples. Any linear combination of codewords of a
linear code is thus also a codeword of the linear code. Since any
codeword added to itself is a codeword, the N-dimensional null vector is always
a codeword of any linear code.

2.  Generator Matrix G and Parity Check Matrix H

Any set of basis vectors for a linear codeword set V can be
considered as rows of a matrix G called the generator matrix. All
codewords are linear combinations of the rows of G. If the dimension of V is
K, the number of rows of G is K and G is a (K x N) matrix. Every codeword
x in the codeword set V can be generated by multiplying the vector u
by the matrix G, where u is one of the set of $2^K$ K-tuples called messages
(x = uG).

The parity check matrix H for a linear code is a matrix such
that for any x, $x H^T = 0$ if and only if x is in V. H is thus an ((N-K) x N)
matrix of rank N-K.

A codeword set V is in canonic systematic form when the first
K digits of a codeword x are the information vector u used to generate x.
The codeword x may be expressed by $x = (a_1, \ldots, a_K, c_1, c_2, \ldots, c_{N-K})$.
The G and H matrices can now be expressed by

$$G = [\, I_K \;\; P \,], \qquad H = [\, -P^T \;\; I_{N-K} \,]$$

($I_K$ denotes an identity matrix of order K). It can be shown that any linear
code can be put in canonic systematic form after a proper permutation of its
codeword positions.
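The relation between G, H, and the codewords in canonic systematic form can be illustrated with a short Python sketch. The (6,3) code used here, defined by the matrix P below, is a hypothetical example chosen for its small size and is not one of the codes studied in this work. Over GF(2) the matrix $-P^T$ equals $P^T$.

```python
from itertools import product


def systematic_matrices(P):
    """Build G = [I_K  P] and H = [P^T  I_(N-K)] over GF(2) from a K x (N-K) matrix P."""
    K, R = len(P), len(P[0])
    G = [[int(i == j) for j in range(K)] + list(P[i]) for i in range(K)]
    H = [[P[i][r] for i in range(K)] + [int(r == j) for j in range(R)] for r in range(R)]
    return G, H


def encode(u, G):
    """Codeword x = uG with modulo-2 arithmetic."""
    return [sum(u[i] * G[i][j] for i in range(len(G))) % 2 for j in range(len(G[0]))]


def syndrome(y, H):
    """Syndrome S = y H^T with modulo-2 arithmetic."""
    return [sum(a * b for a, b in zip(y, row)) % 2 for row in H]


# Hypothetical parity part of a (6,3) code, for illustration only.
P = [[1, 1, 0],
     [0, 1, 1],
     [1, 0, 1]]
G, H = systematic_matrices(P)

x = encode([1, 0, 1], G)
print(x, syndrome(x, H))
all_zero = all(syndrome(encode(list(u), G), H) == [0, 0, 0]
               for u in product((0, 1), repeat=3))
```

The first K digits of each codeword reproduce the message, and every codeword yields the all-zero syndrome.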

3.  Syndrome Decoding


The syndrome S of a linear code may be defined by $S = y H^T$,
where y is the received sequence at the channel decoder. The received
sequence y may be expressed as y = x + e, where x is the codeword
transmitted and e is the error sequence generated by additive noise in the
channel. Then

$$S = y H^T = x H^T + e H^T = e H^T$$

(since $x H^T = 0$ for any codeword x).
Since $S = e H^T$, e is an N-component vector, and $H^T$ is an N x (N-K)
matrix, S is a vector with N-K components.

In general, for any binary (N,K) linear code there are $2^{N-K}$
syndromes, and each of these syndromes has $2^K$ possible error sequences
for which the equation $S = e H^T$ is satisfied. If the decoder is constructed
so that upon receiving an input y it calculates S, then chooses the e*
which is the most likely of the $2^K$ possible error sequences of S, maximum
likelihood decoding can be implemented by adding e* to y to yield x*,
the codeword most likely to have been transmitted.

If S = 0, then the received sequence y is a codeword, and if
S ≠ 0, the received sequence is not a codeword. S = 0 does not guarantee
that no errors were made in transmission, since x + e could sum to a
codeword, but S ≠ 0 does guarantee that some errors did occur.
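A table-driven syndrome decoder can be sketched as follows. Two assumptions are made that do not come from the original text. First, the parity check matrix H is that of a hypothetical (6,3) code chosen for its small size. Second, the most likely error sequence for each syndrome is taken to be the one of lowest weight, which holds on a binary symmetric channel with crossover probability below 1/2 but not necessarily on a burst channel, where the table would have to be built from error sequence probabilities instead.

```python
from itertools import combinations

# Hypothetical parity check matrix of a (6,3) code, for illustration only.
H = [[1, 0, 1, 1, 0, 0],
     [1, 1, 0, 0, 1, 0],
     [0, 1, 1, 0, 0, 1]]


def syndrome(y, H):
    """Syndrome S = y H^T over GF(2), returned as a tuple so it can key a dict."""
    return tuple(sum(a * b for a, b in zip(y, row)) % 2 for row in H)


def build_table(H):
    """Map each syndrome to a lowest-weight error sequence that produces it."""
    N = len(H[0])
    table = {}
    for weight in range(N + 1):  # visit error patterns in order of weight
        for positions in combinations(range(N), weight):
            e = [1 if i in positions else 0 for i in range(N)]
            table.setdefault(syndrome(e, H), e)  # keep the first (lightest) one
    return table


def decode(y, H, table):
    """Add the chosen error sequence e* to y to obtain the estimate x*."""
    e_star = table[syndrome(y, H)]
    return [(a + b) % 2 for a, b in zip(y, e_star)]


table = build_table(H)
sent = [1, 0, 1, 0, 1, 1]       # a codeword of the assumed code
received = [1, 0, 0, 0, 1, 1]   # the same codeword with its third digit in error
print(decode(received, H, table))
```

The table has $2^{N-K} = 8$ entries, one per syndrome, so the decoder stores far fewer entries than the $2^N$ possible received sequences.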

One alternative to maximum likelihood decoding
is to calculate the syndrome S and, if S = 0, accept the codeword
as received and, if S ≠ 0, request a retransmission of the codeword. The
main disadvantages of this system are that an additional reliable communication
link is needed from the user back to the source, and that if a large number
of errors occur, the data rate is greatly reduced and a large buffer may be
needed to maintain symbol synchronization.

C.  CYCLIC  CODES 

1 .  Introduction 

a.  Definition 

As defined by Gallager [3], a cyclic code over GF(q)
is a linear code with the special property that any cyclic shift of a
codeword is another codeword. That is, if $(a_1, a_2, a_3, \ldots, a_N)$ is a
codeword, then $(a_N, a_1, a_2, \ldots, a_{N-1})$ is also a codeword.

b.  Generation  of  a  Cyclic  Linear  Code 

A codeword $x = (x_{N-1}, x_{N-2}, \ldots, x_1, x_0)$ may be
represented by a polynomial over GF(q) (a Galois field of q elements):

$$x(D) = x_{N-1} D^{N-1} + x_{N-2} D^{N-2} + \cdots + x_1 D + x_0 .$$

If x(D) is a codeword in a cyclic code (the coefficients form the letters of
a codeword), then the remainder of D x(D) modulo $D^N - 1$ is also a codeword.
Let g(D) be the lowest degree monic polynomial, of degree m (m = N-K),
which is a codeword. It has been shown that for any polynomial a(D) over
GF(q) with degree at most K-1, a(D) g(D) is a codeword. The polynomial
g(D) is called the generator polynomial of the cyclic code, and all
codewords contain g(D) as a factor. The set of codewords is the set of linear
combinations of g(D) and its first K-1 cyclic shifts. Any monic polynomial
of degree N-K over GF(q) that divides $D^N - 1$ can generate a cyclic
code with K information digits and block length N.

The check polynomial h(D) for an (N,K) linear cyclic
code is defined so that $g(D)\,h(D) = D^N - 1$, and h(D) is of degree K. With
the check polynomial h(D) so defined, it may be shown that, as for the
parity check matrix of any linear code, $x(D)\,h(D) = 0$ modulo $D^N - 1$
if and only if x(D) is a codeword.
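The polynomial description of a binary cyclic code can be checked with a few lines of Python. In this sketch a polynomial over GF(2) is stored as an integer whose bit i is the coefficient of $D^i$. The generator polynomial $g(D) = D^4 + D^2 + D + 1$ of a (7,3) code is an assumption made for the example; it is paired with the check polynomial $h(D) = D^3 + D + 1$ so that their product is $D^7 - 1$. The helper names are illustrative.

```python
def poly_mul(a, b):
    """Multiply two GF(2) polynomials stored as ints (bit i = coefficient of D^i)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def poly_mod(a, m):
    """Remainder of a(D) divided by m(D) over GF(2)."""
    deg_m = m.bit_length() - 1
    while a and a.bit_length() - 1 >= deg_m:
        a ^= m << (a.bit_length() - 1 - deg_m)
    return a


N, K = 7, 3
g = 0b10111              # assumed g(D) = D^4 + D^2 + D + 1
h = 0b1011               # h(D) = D^3 + D + 1
modulus = (1 << N) | 1   # D^N - 1, which equals D^N + 1 over GF(2)

# Every codeword is a(D) g(D) for some a(D) of degree at most K-1.
codewords = {poly_mul(a, g) for a in range(1 << K)}


def cyclic_shift(x):
    """Remainder of D x(D) modulo D^N - 1."""
    return poly_mod(poly_mul(x, 0b10), modulus)


print(sorted(format(c, "07b") for c in codewords))
```

Each cyclic shift of a codeword is again in the set, and multiplying any codeword by h(D) leaves a zero remainder modulo $D^N - 1$.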

2 .  Maximal  Length  Codes 

a .  Definition 

A linear maximal sequence is a binary sequence
generated by a linear shift-register generator which has the longest possible
period for this generation method. The longest period is $L = 2^K - 1$, where
K is the number of stages in the shift-register generator. A linear code
whose codewords are maximal length binary sequences is called a maximal
length code.

b.  Generation

A linear shift-register generator consists of a basic
shift register and modulo-two adders. The generator outputs a binary
sequence that is based on its initial input and the feedback connections
to the modulo-two adders. The binary sequence output of the register is
of maximal length when the feedback connections are made in accordance
with a primitive polynomial as defined by Peterson [2]. The
connections also correspond to the parity check polynomial h(D) described in
the previous section.

As described by Gallager [3], given h(D), a minimum
polynomial of degree m of a primitive element in any representation of
$GF(p^m)$, a maximal length code of block length $N = p^m - 1$ can be generated
by an m-stage shift-register encoder circuit as shown in Figure 6. One
codeword of the code corresponds to the generator polynomial g(D), and the
remaining nonzero codewords are generated by N-1 cyclic shifts of g(D).

A (7,3) maximal length code may be generated using a
feedback shift-register whose connections correspond to a primitive
polynomial of degree 3. The third degree primitive polynomial listed in
Peterson [2] is 13 (octal representation) or 001 011 (binary
representation). This corresponds to $h(D) = 1 + D + D^3$, and the connections to a
feedback shift-register to generate the (7,3) maximal length code are shown in
Figure 7.

Maximal length codes are useful because they are easy
to generate and have a large minimum distance for their block length. The
(7,3) maximal length code has a block length of 7, a rate of 3/7, and a
minimum distance of 4. This code is thus able to correct all single errors
and many double error patterns.
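The properties of the (7,3) maximal length code can be verified by simulating a three-stage shift register. The sketch below assumes the recurrence s[n+3] = s[n+1] + s[n] (modulo 2) as the feedback rule associated with a primitive polynomial of degree 3. The mapping between polynomial coefficients and tap positions depends on the drawing convention, so the taps here are an assumption and need not match the figure. The function name is illustrative.

```python
from itertools import product


def lfsr_codeword(info, taps=(0, 1), length=7):
    """Extend the K information digits with s[n+K] = sum of s[n+t] for t in taps (mod 2)."""
    s = list(info)
    K = len(info)
    while len(s) < length:
        n = len(s) - K
        s.append(sum(s[n + t] for t in taps) % 2)
    return tuple(s)


# One codeword for each of the 2^3 possible initial register contents.
codewords = [lfsr_codeword(u) for u in product((0, 1), repeat=3)]
nonzero = [c for c in codewords if any(c)]

# For a linear code the minimum distance equals the minimum non-zero weight.
weights = {sum(c) for c in nonzero}

base = lfsr_codeword((1, 0, 0))
shifts = {base[i:] + base[:i] for i in range(7)}
print(base, weights)
```

Every non-zero codeword has weight 4, so the minimum distance is 4, and the seven non-zero codewords are exactly the seven cyclic shifts of a single codeword.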

3.  BCH  Codes 

a.  General 

The Bose, Chaudhuri, and Hocquenghem (BCH) codes
were first discovered in 1959. These codes are cyclic codes which have
powerful error-correcting properties and for which relatively simple decoding
algorithms exist. The BCH codes have become the most important and
widely used linear cyclic codes. Most examples of BCH codes are binary,
but the alphabet can be elements from any arbitrary Galois field GF(q).

For these BCH codes it is possible to specify a block
length N (usually $N = 2^m - 1$) and a minimum distance d (d ≤ N) and choose
a generator matrix to produce a code with the specified length and at least
the specified distance.
For lengths up to 1023, the BCH codes have rates which meet or exceed
the Gilbert bound, although as N approaches infinity they fail to do so.

b.  Generation

Suppose a block length of 15 (m = 4) and a minimum
distance of 5 are desired (ability to correct two errors). For α a primitive
element of $GF(2^4)$, the generator polynomial for this desired code can be
calculated by taking the product of the distinct minimum polynomials for d-1
consecutive powers of α. (Refer to Gallager [3], page 233, for a brief
list of minimal polynomials.) Calculation of G(D) by this method yields

$$G(D) = (D^4 + D + 1)(D^4 + D^3 + D^2 + D + 1) = D^8 + D^7 + D^6 + D^4 + 1 .$$

Since the generator polynomial is of degree N-K, N-K = 8, K = 7, and the
code is a (15,7) BCH code. A possible generator matrix for this code is a
matrix whose first row is the code vector corresponding to the generator
polynomial G(D). Since $G(D) = D^8 + D^7 + D^6 + D^4 + 1$, the first row of
the generator matrix could be 000 000 111 010 001. The remaining K-1
rows of the generator matrix could be the first K-1, or 6, cyclic shifts of
the first row.
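The generator polynomial and generator matrix of the (15,7) BCH code can be reproduced with a short Python sketch. Polynomials over GF(2) are stored as integers whose bit i is the coefficient of $D^i$; the helper names are illustrative and are not part of the original work. The sketch also enumerates all 128 codewords to confirm the minimum distance.

```python
def poly_mul(a, b):
    """Multiply two GF(2) polynomials stored as ints (bit i = coefficient of D^i)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


N = 15
m1 = 0b10011   # D^4 + D + 1
m3 = 0b11111   # D^4 + D^3 + D^2 + D + 1
g = poly_mul(m1, m3)              # generator polynomial G(D)
K = N - (g.bit_length() - 1)      # degree of G(D) is N - K

# Rows of a generator matrix: G(D) and its first K-1 shifts.
# The highest degree reached is 8 + 6 = 14, so no wrap-around occurs.
rows = [g << i for i in range(K)]


def encode(message):
    """XOR together the rows selected by the bits of the integer `message`."""
    x = 0
    for i, row in enumerate(rows):
        if (message >> i) & 1:
            x ^= row
    return x


# Minimum distance of a linear code = minimum weight of a non-zero codeword.
min_distance = min(bin(encode(u)).count("1") for u in range(1, 1 << K))
first_row = format(g, "015b")
print(first_row, K, min_distance)
```

The first row agrees with the code vector 000 000 111 010 001 given above, and the exhaustive search over the 127 non-zero codewords gives a minimum distance of 5.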

4.  Interleaving



a.  General 

A simple and often used technique to combat burst
errors on a channel is the use of an interleaver. The principle is to
separate successive digits within a codeword by a certain time interval
so that burst errors on the channel will not appear successively in the
codeword. If the interleaving achieves a separation of B bits, a burst
of B errors causes at most one error to appear in each codeword. This
technique distributes the channel burst errors in a pseudo-random manner
and gives the decoder an opportunity to correct an otherwise
uncorrectable burst error pattern, at the possible expense of making more decoding
errors. The two most common interleavers are the block interleaver and
the periodic (or convolutional) interleaver.

b.  Block  Interleavers 

Block interleavers are the most common type of
interleaver. The interleaving is usually accomplished by storing encoded
codeword bits in the rows of a B x N matrix and then reading out these bits
by columns prior to their transmission across the channel. This produces
a separation of B bits between adjacent bits of the codeword when it transits
the channel. The greater the degree of interleaving, the more storage is
required and the longer the time delay from the encoding of a word until it is
actually transmitted across the channel. The received bits are
deinterleaved prior to their decoding.
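A block interleaver of this kind can be sketched in a few lines of Python. The function names and the choice B = 4, N = 7 are illustrative assumptions. The example marks a channel burst of B consecutive errors and shows how the deinterleaver spreads it over the B codewords.

```python
def interleave(bits, B, N):
    """Write B codewords of N bits into the rows of a B x N array; read out by columns."""
    if len(bits) != B * N:
        raise ValueError("expected exactly B * N bits")
    rows = [bits[r * N:(r + 1) * N] for r in range(B)]
    return [rows[r][c] for c in range(N) for r in range(B)]


def deinterleave(bits, B, N):
    """Inverse operation: restore the row-by-row (codeword) order."""
    if len(bits) != B * N:
        raise ValueError("expected exactly B * N bits")
    return [bits[c * B + r] for r in range(B) for c in range(N)]


B, N = 4, 7                      # illustrative sizes
data = list(range(B * N))        # labels standing in for encoded bits
restored = deinterleave(interleave(data, B, N), B, N)

# A burst of B consecutive channel errors, marked in transmission order.
burst = [1 if 5 <= i < 5 + B else 0 for i in range(B * N)]
spread = deinterleave(burst, B, N)
errors_per_codeword = [sum(spread[r * N:(r + 1) * N]) for r in range(B)]
print(errors_per_codeword)
```

After deinterleaving, each of the B codewords contains exactly one of the B burst errors, which a single-error-correcting code can then correct.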

c.  Periodic Interleaver

A periodic (or convolutional) B x N interleaver achieves
interleaving by arranging the codeword symbols in blocks of N and
delaying the i th symbol in each block by (i-1)B time units. The delay is
accomplished using an (i-1)B* stage shift-register clocked once every N
symbol times, where B* = B/N.

At the receiver, symbols are reblocked in groups of N by
the deinterleaver, and the i th symbol in each block is now delayed by
(N-i)B time units using an (N-i)B* stage shift-register.

The result of this interleaving and deinterleaving is to
delay all symbols by (N-1)B time units and separate adjacent codeword
symbols by B time units. A single channel burst of B or fewer time units
will affect only one of the N deinterleaver output streams at a time.
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IV.  DECODING 


A.  GENERAL 

1. The Decoding Problem

The basic problem of the block code decoder is to choose the codeword
that was actually transmitted from the set of $2^L$ possible codeword
N-tuples (L being the number of information digits per codeword), given
that a certain N-tuple $y$ was received. There are a number of possible
ways the decoder could make the required choice. Two decoding methods
are maximum likelihood decoding and minimum distance decoding.

2. Maximum Likelihood Decoding

Let

$$x_m = [x_{m,1}, x_{m,2}, \ldots, x_{m,N}]$$

denote a transmitted codeword and

$$y = [y_1, y_2, \ldots, y_N]$$

the N-tuple received by the decoder. Given that a certain $y$ has been
received, the maximum likelihood decoder chooses $x_m$, one of the set of
$2^L$ possible codewords, such that the probability $\Pr(y/x_m)$ is
maximized. To make the proper choice of $x_m$, the decoder must calculate
the probability of $y$ for each of the $2^L$ possible codewords which
could have been transmitted. Since each of these $2^L$ probability
calculations takes time, maximum likelihood decoding is not really
practical for long codes.

The minimum-error probability decoding rule is the decoding rule which
minimizes the probability of decoding error for a given message ensemble
of codewords. For discrete memoryless channels the decoder minimizes the
probability of error by choosing an $x_m$ so that the probability of that
$x_m$ conditioned on the received sequence $y$ is largest. If all of the
$2^L$ codewords are equally likely (which is usually assumed), it can be
shown that maximum likelihood decoding is equivalent to minimum-error
probability decoding.
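
The equivalence follows in one step from Bayes' rule. As a sketch, write M
for the number of codewords (a symbol introduced here only for this
illustration) and keep the Pr(y/x) notation of this chapter:

```latex
\Pr(x_m / y) = \frac{\Pr(y / x_m)\,\Pr(x_m)}{\Pr(y)},
\qquad \Pr(x_m) = \frac{1}{M} \ \text{for all } m
\;\Longrightarrow\;
\arg\max_m \Pr(x_m / y) = \arg\max_m \Pr(y / x_m).
```

Neither $\Pr(y)$ nor the common factor $1/M$ depends on $m$, so both drop
out of the maximization.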

3. Minimum Distance Decoding

Given any two binary N-tuples, the distance between them is defined as
the number of positions in which the two sequences differ. The distance
between a transmitted codeword $x$ and a received sequence $y$ is
therefore

$$d(y, x) = \sum_{n=1}^{N} \delta(x_n, y_n)$$

where

$$\delta(x_n, y_n) = 1 \text{ if } x_n \neq y_n, \text{ and } 0 \text{ otherwise.}$$

As previously stated, the minimum distance is a rough measure of a code's
error correcting and detecting ability. A minimum distance of $d_{\min}$
guarantees the ability to correct at least $\lfloor (d_{\min} - 1)/2 \rfloor$
errors. $d(y, x)$ is the number of errors that have occurred in the
channel. For a memoryless binary symmetric channel

$$\Pr(y/x) = p^{d(y,x)} (1 - p)^{N - d(y,x)}$$

where Pr denotes probability and $p$ is the probability of a digit being
in error. Since $p$ is always assumed to be less than 1/2, $\Pr(y/x)$
increases as $d(y, x)$ decreases. The decoder which minimizes $d(y, x)$
therefore always maximizes $\Pr(y/x)$. On the binary symmetric channel
with equally likely codewords, the "minimum distance" decoder is
equivalent to a maximum likelihood decoder.
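
The equivalence can be checked by brute force on a small code. The sketch
below assumes a particular generator matrix for a (7,3) maximal length
code (cyclic shifts of 1 + x + x^2 + x^4); the thesis's own construction
may differ, and all function names are hypothetical. For every received
7-tuple it compares the set of codewords chosen by each rule.

```python
from itertools import product

# Assumed generator matrix of a (7,3) maximal-length code (illustrative choice).
G = [(1, 1, 1, 0, 1, 0, 0),
     (0, 1, 1, 1, 0, 1, 0),
     (0, 0, 1, 1, 1, 0, 1)]


def encode(msg):
    """Multiply the 3-bit message by G over GF(2)."""
    return tuple(sum(m * g for m, g in zip(msg, col)) % 2 for col in zip(*G))


CODEWORDS = [encode(m) for m in product((0, 1), repeat=3)]


def distance(x, y):
    """Number of positions in which x and y differ."""
    return sum(a != b for a, b in zip(x, y))


def likelihood(y, x, p):
    """Pr(y/x) on a binary symmetric channel with digit error probability p."""
    d = distance(x, y)
    return p ** d * (1 - p) ** (len(x) - d)


def best(score):
    """All codewords achieving the maximum score (ties are kept)."""
    top = max(score(x) for x in CODEWORDS)
    return {x for x in CODEWORDS if score(x) == top}


def rules_agree(p):
    """True if both rules select the same codewords for every received word."""
    return all(
        best(lambda x: -distance(x, y)) == best(lambda x: likelihood(y, x, p))
        for y in product((0, 1), repeat=7)
    )
```

In this sketch `rules_agree(p)` holds for p below 1/2 and fails once p
exceeds 1/2, where the likelihood grows with distance instead of
shrinking.
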

B.  DECODING  ON  CHANNELS  WITH  MEMORY 
1 .  General 

The concept of maximum likelihood decoding applies to channels with
memory as well as to memoryless channels. Maximum likelihood decoding,
however, cannot be implemented on a channel with memory by using a
minimum distance decoding rule. The usual decoding strategy for channels
with memory is to map each syndrome into the error pattern which is the
shortest burst that could cause that syndrome. Decoders which are optimum
in this sense are known [3], but they achieve the lowest probability of
error only if a short burst is always more likely than a longer burst.

In order to obtain some quantitative evaluation of maximum likelihood
burst decoding, a simulation of two codes using a Gilbert channel model
was performed, as described in the following sections.


2. (7,3) Maximal Length Code


A (7,3) maximal length code was chosen because it has a short block
length, a relatively large minimum distance (4), and is easily
constructed. The code has 16 syndromes, and 8 error sequences are
solutions to $S = eH^T$ for each syndrome. Since the minimum distance of 4
gives the capability to correct all single errors, no two of the
correctable error sequences (the zero error sequence and the seven single
error sequences) were in the same syndrome, and each was assumed to be
the most likely error sequence for its corresponding syndrome. Seven of
the remaining eight syndromes each contained three weight-two error
sequences. The remaining syndrome contained seven weight-three error
sequences and the burst of length seven.
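
This syndrome structure can be reproduced by enumerating the cosets of
the code. The sketch below again assumes a specific generator matrix for
the (7,3) code, which may not be the one used in the thesis, and labels
each syndrome by the smallest member of its coset instead of computing
$eH^T$ explicitly.

```python
from collections import defaultdict
from itertools import product

# Assumed generator matrix of a (7,3) maximal-length code (illustrative choice).
G = [(1, 1, 1, 0, 1, 0, 0),
     (0, 1, 1, 1, 0, 1, 0),
     (0, 0, 1, 1, 1, 0, 1)]
CODEWORDS = [tuple(sum(m * g for m, g in zip(msg, col)) % 2 for col in zip(*G))
             for msg in product((0, 1), repeat=3)]


def add(a, b):
    """Bitwise modulo-2 sum of two sequences."""
    return tuple(x ^ y for x, y in zip(a, b))


# Two error sequences share a syndrome exactly when they differ by a codeword,
# so the smallest element of e + code is a canonical label for the syndrome.
cosets = defaultdict(list)
for e in product((0, 1), repeat=7):
    cosets[min(add(e, c) for c in CODEWORDS)].append(e)

# Sorted list of error-sequence weights found under each syndrome.
profiles = sorted(sorted(sum(e) for e in members) for members in cosets.values())

for weights in profiles:
    print(weights)
```

Seven of the printed weight profiles start with three 2's, and one
consists of seven 3's followed by a 7, in line with the description
above.
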

The parameters P, p, and h of the partitioned 3 state Gilbert model were
varied, and error sequence probabilities were calculated for the zero
error sequence, the seven single error sequences, and the 64 error
sequences of the remaining 8 syndromes, using the method described in
Chapter II.

A maximum likelihood decoding error probability, P(E), was calculated by
summing the probabilities of the most likely error sequence in each of
the 16 syndromes and subtracting this cumulative sum from one. The
results of these error sequence probability and maximum likelihood
decoding calculations are as follows:

(a) The most likely error sequence for each syndrome was the burst
pattern of minimum length. Error sequences of the same weight were not
in general equally likely using the Gilbert model; their probabilities
depended on the parameters P, p, and h.

(b) The probability of decoding error, P(E), increased as the channel
model was made more noisy by increasing p or h or by decreasing P (see
Figs. 9, 10, and 11).

(c) The probability of error is influenced more by P, the probability
of transition from the burst state, than by the parameters p or h of
the Gilbert model (see Figs. 9, 10, and 11).

(d)  Finally,  a  binary  symmetric  channel  was  modeled 
by  letting  P  =  1-p  and  h  =  1.  With  this  model,  all 
error  sequences  of  equal  weight  were  equally  likely,  and 
the  most  likely  sequences  were  the  ones  of  least  weight. 

The  probability  of  decoding  error,  P(E),  increased  as  the 
model  was  made  more  noisy. 
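
The P(E) calculation described above can be sketched as follows. Several
things here are assumptions, not the thesis's method: a plain two-state
(good/burst) Markov chain stands in for the partitioned 3 state model of
Chapter II, with p taken as the good-to-burst transition probability, P
as the burst-to-good transition probability, h as the error probability
in the burst state, an error-free good state, and a stationary starting
distribution. The generator matrix and function names are likewise
illustrative.

```python
from collections import defaultdict
from itertools import product

# Assumed generator matrix of a (7,3) maximal-length code (illustrative choice).
G = [(1, 1, 1, 0, 1, 0, 0),
     (0, 1, 1, 1, 0, 1, 0),
     (0, 0, 1, 1, 1, 0, 1)]
CODEWORDS = [tuple(sum(m * g for m, g in zip(msg, col)) % 2 for col in zip(*G))
             for msg in product((0, 1), repeat=3)]


def seq_prob(e, P, p, h):
    """Pr(e) by forward recursion over the assumed two-state chain."""
    def emit(bit):
        # (good-state, burst-state) probability of producing this error bit
        return (1.0 - bit, h if bit else 1.0 - h)

    eg, eb = emit(e[0])
    good = P / (p + P) * eg      # stationary start (assumption)
    burst = p / (p + P) * eb
    for bit in e[1:]:
        eg, eb = emit(bit)
        good, burst = ((good * (1 - p) + burst * P) * eg,
                       (good * p + burst * (1 - P)) * eb)
    return good + burst


def decoding_error_probability(P, p, h):
    """One minus the summed probability of the most likely sequence per syndrome."""
    best = defaultdict(float)
    for e in product((0, 1), repeat=7):
        # Smallest member of e + code serves as the syndrome label.
        key = min(tuple(a ^ b for a, b in zip(e, c)) for c in CODEWORDS)
        best[key] = max(best[key], seq_prob(e, P, p, h))
    return 1.0 - sum(best.values())
```

Under these assumptions, setting P = 1 - p and h = 1 makes successive
states independent, so `seq_prob` reduces to the binary symmetric channel
expression, as in item (d).
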

Since  this  code  has  a  very  limited  number  of  syndromes,  its 
ability  to  correct  long  bursts  was  limited.  This  suggested  a  code  of  longer 
length  with  greater  burst  correction  capability  should  be  investigated. 

3. (15,7) BCH Code

A (15,7) BCH code was constructed as described in Chapter III. This code
has $2^8$ syndromes, and $2^7$ error sequences are solutions to
$S = eH^T$ for each syndrome. Since it was impractical to calculate all
possible error sequence probabilities as was done for the (7,3) maximal
length code, another method had to be used.

An error sequence of interest was chosen. All error sequences which have
the same syndrome as the chosen error sequence were generated by the
addition (modulo 2) of this error sequence to every codeword of the
(15,7) BCH code. The probability of each of these sequences was then
calculated (as described in Chapter III) using the partitioned three
state Gilbert model. The results of these calculations were as follows:

(a)  Error  sequences  of  equal  weight  are  not  in  general 
equally  likely  and  the  selection  of  the  Gilbert  model 
parameters  determines  which  one  of  the  equal  weight 
patterns  is  the  most  likely. 

(b)  The  most  likely  error  sequence  for  a  particular 
syndrome  is  not  always  the  error  sequence  of  least 
weight  or  the  burst  error  sequence  of  shortest  length. 

Figure 12 shows the probability of three different error sequences as h,
the burst state error probability, is varied. For values of h between
1.0 and .75, a solid burst of length 6 is the most likely sequence. For
values of h between .75 and .39, a burst of length 5 is the most likely
sequence. When h is less than .39, the minimum weight sequence of weight
three is the most likely.
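
The enumeration step used for this code, adding a chosen error sequence
to every codeword, can be sketched as below. The generator polynomial
g(x) = 1 + x^4 + x^6 + x^7 + x^8 is the usual one for a double-error
correcting (15,7) BCH code and is assumed here; the thesis's Chapter III
construction may differ, and the function names are hypothetical. The
sketch lists the 128 sequences sharing a syndrome with their weight and
burst length, leaving the probability calculation aside.

```python
from itertools import product

N, K = 15, 7
g = (1, 0, 0, 0, 1, 0, 1, 1, 1)  # 1 + x^4 + x^6 + x^7 + x^8, low order first
G = [(0,) * i + g + (0,) * (K - 1 - i) for i in range(K)]  # shifts of g(x)
CODEWORDS = [tuple(sum(m * row[j] for m, row in zip(msg, G)) % 2 for j in range(N))
             for msg in product((0, 1), repeat=K)]


def coset_of(e):
    """All 2^7 error sequences with the same syndrome as e."""
    return [tuple(a ^ b for a, b in zip(e, c)) for c in CODEWORDS]


def syndrome(e):
    """Remainder of e(x) divided by g(x) over GF(2)."""
    r = list(e)
    deg = len(g) - 1
    for i in range(N - 1, deg - 1, -1):
        if r[i]:
            for j, gj in enumerate(g):
                r[i - deg + j] ^= gj
    return tuple(r[:deg])


def burst_length(e):
    """Span from the first to the last error, inclusive (0 if no errors)."""
    ones = [i for i, bit in enumerate(e) if bit]
    return ones[-1] - ones[0] + 1 if ones else 0


def coset_summary(e):
    """(weight, burst length, sequence) for each coset member, lightest first."""
    return sorted((sum(v), burst_length(v), v) for v in coset_of(e))


solid_burst = (1,) * 6 + (0,) * 9  # example: solid burst of length 6
for weight, length, seq in coset_summary(solid_burst)[:5]:
    print(weight, length, seq)
```

Each listed sequence could then be passed to a channel-model probability
routine to find which member of the syndrome is most likely for a given
choice of P, p, and h.
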


V. CONCLUSIONS


Although many better and much more complicated channel models exist, a
bursty channel can be modeled using the simple partitioned 3 state
Gilbert model. Error sequence probabilities can be easily calculated
using this model. These calculations show that, for some codes, error
sequences of the same weight are not equally likely, and burst error
sequences of smaller length may not be as likely as longer burst error
sequences.

The optimum burst decoder, as proposed by Gallager [3], which always
chooses the burst error of smallest length as the most likely error
sequence, is not optimum in the sense of having an error probability as
low as a maximum likelihood decoder. Likewise, a minimum weight decoder
is not a maximum likelihood decoder for this model. This suggests that a
decoder having the minimum probability of error for the bursty channel
cannot be easily constructed.

If  it  is  possible  to  model  a  real  channel  using  a  finite  number  of 
states  involving  a  Markov  chain,  it  is  then  possible  to  calculate  error 
sequence  probabilities  and  choose  the  type  of  decoder  required  to  give  the 
lowest  probability  of  decoding  error. 

A common technique for combatting burst errors has been to use an
interleaver to scatter burst channel errors in a pseudo-random manner.
The rationale behind this technique is that if the degree of interleaving
is large enough, the burst errors will be sufficiently scattered so that
the channel can be treated as memoryless. Since the errors now seem to
occur in a random manner, a code with good random error correcting
ability is sometimes used in conjunction with a minimum distance decoder.
The channel burst errors, however, are not purely random but are
distributed systematically in accordance with the interleaver used.
Purely random error correction does not use the information contained
between interleaved codewords about how errors are distributed.
Interleaving does enable a code to correct otherwise uncorrectable long
burst errors. This suggests that a better burst error correction
technique would be to use an interleaver together with a decoding rule
which uses the information contained between interleaved codewords to
aid in burst error correction.

Figure 1. Block Diagram of a General Communication System.

Figure 2. Block Diagram of a General Communication System with a
Discrete Memoryless Source Assumed.

Figure 3. The Binary Symmetric Channel.

Figure 6. Asymptotic Error Correction Bounds.

Figure 7. Encoder for a Maximal-Length Code.

PSEQ (Error Sequence Probability)